frailtyPenal for Nested frailty models: Fit Nested Frailty model using penalized likelihood estimation

Description

Fit a nested frailty model using penalized likelihood estimation on the hazard function. Nested frailty models allow survival analysis of hierarchically clustered data by including two iid gamma random effects. Left-truncated and censored data are allowed. Stratified analysis is allowed (maximum of 2 strata).

The hazard function conditional on the two frailties $v_i$ and $w_{ij}$ for the $k^{th}$ individual of the $j^{th}$ subgroup of the $i^{th}$ group is:

$$\lambda_{ijk}(t|v_i,w_{ij})=v_iw_{ij}\lambda_0(t)\exp(\bold{\beta^{'}X_{ijk}})$$

$$\small{ v_i\sim\Gamma\left(\frac{1}{\alpha},\frac{1}{\alpha}\right) \hspace{0.05cm}i.i.d. \hspace{0.2cm} \bold{E}(v_i)=1 \hspace{0.2cm}\bold{Var}(v_i)=\alpha \hspace{0.5cm} w_{ij}\sim\Gamma\left(\frac{1}{\eta},\frac{1}{\eta}\right)\hspace{0.05cm}i.i.d. \hspace{0.2cm} \bold{E}(w_{ij})=1 \hspace{0.2cm}\bold{Var}(w_{ij})=\eta}$$

where $\lambda_0(t)$ is the baseline hazard function, $X_{ijk}$ denotes the covariate vector and $\beta$ the corresponding vector of regression parameters.
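
As a minimal sketch of the frailty structure above, the following base R code draws group and subgroup frailties from the two gamma distributions and evaluates the conditional hazard for one individual. The values of `alpha`, `eta`, `beta`, the covariate vector `x`, the balanced design, and the constant baseline hazard `lambda0` are all illustrative assumptions, not estimates from any data set; `cond_hazard` is a hypothetical helper, not part of frailtypack.

```r
set.seed(1)

alpha <- 0.5   # assumed variance of the group frailty v_i
eta   <- 0.3   # assumed variance of the subgroup frailty w_ij

n_groups    <- 2000
n_subgroups <- 5      # subgroups per group (assumed balanced design)

# Gamma(shape = 1/alpha, rate = 1/alpha) has mean 1 and variance alpha
v <- rgamma(n_groups, shape = 1 / alpha, rate = 1 / alpha)

# One row per group, one column per subgroup
w <- matrix(rgamma(n_groups * n_subgroups, shape = 1 / eta, rate = 1 / eta),
            nrow = n_groups)

# Conditional hazard for one individual with covariate vector x.
# A constant baseline hazard lambda0 is assumed for illustration only.
cond_hazard <- function(v_i, w_ij, x, beta, lambda0 = 0.1) {
  v_i * w_ij * lambda0 * exp(sum(beta * x))
}

beta <- c(0.4, -0.2)   # hypothetical regression coefficients
x    <- c(1, 0)        # hypothetical covariate values
h    <- cond_hazard(v[1], w[1, 2], x, beta)
```

With this parameterization the sample mean of `v` should be close to 1 and its sample variance close to `alpha`, which is what makes the frailty variances directly interpretable.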

Arguments

formula

a formula object, with the response on the left of a ~ operator, and the terms on the right. The response must be a survival object as returned by the 'Surv' function, as in the survival package. The subcluster()

formula.terminalEvent

Not required.

data

a 'data.frame' in which to interpret the variables named in the 'formula'.

Frailty

Logical value. Is a model with frailties fitted? If so, the variance of the frailty parameter is estimated. If not, a Cox proportional hazards model is estimated using penalized likelihood on the hazard function. The default is FALSE.

joint

Not required.

recurrentAG

Logical value. Is an Andersen-Gill model fitted? If so, recurrent event times are handled with the counting process approach of Andersen and Gill. This formulation can be used for dealing with time-dependent covar

cross.validation

Logical value. Is the cross validation procedure used for estimating the smoothing parameter? If so, a search for the smoothing parameter using cross validation is done, with kappa1 as the seed. The cross validation is not implemen

n.knots

integer giving the number of knots to use. Value required. It corresponds to the (n.knots+2) spline functions used to approximate the hazard or the survival functions. The number of knots must be between 4 and 20. (See Note)

kappa1

positive smoothing parameter. The coefficient kappa of the integral of the squared second derivative of hazard function in the fit (penalized log likelihood). To obtain an initial value for kappa1 (or kappa2), a soluti

kappa2

positive smoothing parameter for the second stratum, when data are stratified. See kappa1.

maxit

maximum number of iterations for the Marquardt algorithm. Default is 350.

Value

A nested frailty model or, more generally, an object of class 'frailtyPenal'. Methods defined for 'frailtyPenal' objects are provided for print, plot and summary. The following components are included in a 'frailtyPenal' object for nested frailty models.

alpha

variance of the cluster effect $(\bold{Var}(v_{i}))$

call

the code used for fitting the model

coef

the coefficients of the linear predictor, which multiply the columns of the model matrix

cross.Val

Logical value. Is the cross validation procedure used for estimating the smoothing parameters?

DoF

degrees of freedom

eta

variance of the subcluster effect $(\bold{Var}(w_{ij}))$

formula

the formula part of the code used for the model

groups

the maximum number of groups used in the fit

subgroups

the maximum number of subgroups used in the fit

kappa

a vector with the smoothing parameters corresponding to each baseline function as components

lam

matrix of hazard estimates at x1 times and confidence bands

lam2

the same value as lam for the second stratum

n

the number of observations used in the fit

n.events

the number of events observed in the fit

n.iter

number of iterations needed to converge

n.knots

number of knots

n.strat

number of strata

nvar

a vector with the number of covariates of each type of hazard function as components

surv

matrix of baseline survival estimates at x1 times and confidence bands

surv2

the same value as surv for the second stratum

type

a character string specifying the type of censoring. Possible values are "right", "left", "counting", "interval", "interval2". The default is "right" or "counting" depending on whether the 'time2' argument is absent (not interval-censored data) or present (interval-censored data), respectively.

varH

the variance matrix of all parameters (alpha, eta, the regression coefficients and the spline coefficients)

varHIH

the robust estimation of the variance matrix of all parameters (alpha, eta, the regression coefficients and the spline coefficients)

x1

vector of times where both survival and hazard function are estimated. By default seq(0,max(time),length=99), where time is the vector of survival times.

x2

the same value as x1 for the second stratum

Synopsis

frailtyPenal(formula, formula.terminalEvent, data, Frailty = FALSE, joint = FALSE, recurrentAG = FALSE, cross.validation = FALSE, n.knots, kappa1, kappa2, maxit = 350)

Details

The estimated parameters are obtained by maximizing the penalized log-likelihood using the robust Marquardt algorithm (Marquardt, 1963), which is a combination of a Newton-Raphson algorithm and a steepest descent algorithm. When the frailty parameter is small, numerical problems may arise. To solve this problem, an alternative formula of the penalized log-likelihood is used (see Rondeau, 2003 for further details). Cubic M-splines of order 4 are used for the hazard function, and I-splines (integrated M-splines) are used for the cumulative hazard function. The smoothing parameter can be fixed a priori or chosen by maximizing a likelihood cross validation criterion.

The iterations are stopped when the difference between two consecutive log-likelihoods is small $(<10^{-4})$, the estimated coefficients are stable (consecutive values differ by $<10^{-4}$), and the gradient is small enough $(<10^{-6})$. To be sure of having a positive function at all stages of the algorithm, the spline coefficients are reparametrized to be positive at each stage. The variance space of the two random effects is reduced, so the variances are positive, and the correlation coefficient is constrained to lie between -1 and 1. The integrations in the full log likelihood are evaluated using Gaussian quadrature. Laguerre polynomials are used to treat the integrations on $[0,\infty[$.
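
To illustrate the Gaussian quadrature step, here is a minimal base R sketch of Gauss-Laguerre quadrature, which approximates integrals of the form $\int_0^\infty e^{-x} f(x)\,dx$ by a weighted sum over nodes. It is not the package's Fortran implementation: the helper `gauss_laguerre` is hypothetical, the nodes and weights are computed with the Golub-Welsch eigenvalue method, and the choice of 20 points, the frailty shape `a`, and the cumulative hazard value `H` are assumptions made for the example.

```r
# Gauss-Laguerre nodes and weights via the Golub-Welsch eigenvalue method.
# The Jacobi matrix for Laguerre polynomials has diagonal 1, 3, 5, ...
# and off-diagonal entries 1, 2, 3, ...
gauss_laguerre <- function(n) {
  k <- seq_len(n)
  J <- diag(2 * k - 1, nrow = n)
  if (n > 1) {
    off <- seq_len(n - 1)
    J[cbind(off, off + 1)] <- off   # super-diagonal
    J[cbind(off + 1, off)] <- off   # sub-diagonal
  }
  e <- eigen(J, symmetric = TRUE)
  # Nodes are the eigenvalues; weights are the squared first components
  # of the normalized eigenvectors (the weight function integrates to 1)
  list(nodes = e$values, weights = e$vectors[1, ]^2)
}

gl <- gauss_laguerre(20)   # 20 points is an assumed, illustrative choice

# Sanity check: the integral of x^2 * exp(-x) over [0, Inf) equals 2
int_x2 <- sum(gl$weights * gl$nodes^2)

# Expectation of exp(-v * H) for v ~ Gamma(shape = a, rate = a).
# Substituting x = a * v turns it into a Gauss-Laguerre integral.
a <- 2   # assumed shape, i.e. frailty variance 1/a = 0.5
H <- 1   # assumed cumulative hazard value
laplace_quad  <- sum(gl$weights * gl$nodes^(a - 1) *
                       exp(-H * gl$nodes / a)) / gamma(a)
laplace_exact <- (1 + H / a)^(-a)   # closed form for the gamma distribution
```

The closed form is available here only because the frailty is gamma distributed with a single random effect; the comparison simply shows that the quadrature reproduces a known value before it is trusted on integrands without one.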

INITIAL VALUES

The spline and regression coefficients are initialized to 0.1. The program fits an adjusted Cox model to provide new initial values for the regression and spline coefficients. The variances of the frailties are initialized to 0.1. Then, a shared frailty model with covariates and only subgroup frailty terms is fitted to give a new initial value for the variance of the subgroup frailty term. Next, a shared frailty model with covariates and only group frailty terms is fitted to give a new initial value for the variance of the group frailties. In the last step, a nested frailty model is fitted.

PARAMETERS LIMIT VALUES

As frailtypack is written in Fortran 77, some parameters had to be hard coded. The default values of these parameters are listed below, with the corresponding variable name in the Fortran code in brackets.

maximum number of observations (ndatemax): 30000
maximum number of groups (ngmax): 1000
maximum number of subjects (nsujetmax): 15000
maximum number of parameters (npmax): 50
maximum number of covariates (nvarmax): 50
maximum number of subgroups (nssgmax): 5000

If these parameters are not large enough (an error message will let you know this), you need to reset them in nested.f and recompile.
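
Before launching a long fit, it can be convenient to compare a data set against some of these limits. The sketch below does this in base R. `check_nested_limits` and `nested_limits` are hypothetical names, not part of frailtypack; the helper assumes that subgroups are identified by distinct (group, subgroup) pairs and that the covariate count is the number of covariate columns, which may differ from how the Fortran code counts them (for example after factor expansion).

```r
# Hard-coded limits quoted above (a subset of them)
nested_limits <- c(ndatemax = 30000, ngmax = 1000,
                   nssgmax = 5000, nvarmax = 50)

# Hypothetical helper: returns a named logical vector,
# TRUE where the data set is within the corresponding limit
check_nested_limits <- function(data, group, subgroup, covariates) {
  used <- c(
    ndatemax = nrow(data),                              # observations
    ngmax    = length(unique(data[[group]])),           # groups
    nssgmax  = nrow(unique(data[c(group, subgroup)])),  # subgroups
    nvarmax  = length(covariates)                       # covariates
  )
  used <= nested_limits[names(used)]
}

# Toy data for illustration only
set.seed(42)
toy <- data.frame(group    = rep(1:3, each = 4),
                  subgroup = rep(1:2, times = 6),
                  cov1     = rnorm(12),
                  cov2     = rbinom(12, 1, 0.5))

ok <- check_nested_limits(toy, "group", "subgroup", c("cov1", "cov2"))
```

Any `FALSE` entry in the result names the Fortran parameter that would need to be raised before recompiling.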

References

V. Rondeau, L. Filleul, and P. Joly (2006). Nested frailty models using maximum penalized likelihood estimation. Statistics in Medicine, 25, 4036-4052.

V. Rondeau, D. Commenges, and P. Joly (2003). Maximum penalized likelihood estimation in a gamma-frailty model. Lifetime Data Analysis, 9, 139-153.

D. Marquardt (1963). An algorithm for least-squares estimation of nonlinear parameters. SIAM Journal of Applied Mathematics, 431-441.

Examples


### Nested model (or hierarchical model) with 2 covariates ###

library(frailtypack)
data(dataNested)

modClu <- frailtyPenal(Surv(t1, t2, event) ~ cluster(group) + subcluster(subgroup) +
                         cov1 + cov2,
                       Frailty = TRUE, data = dataNested, n.knots = 8, kappa1 = 50000)

# It takes around 24 minutes to converge (depends on the processor)

print(modClu)
summary(modClu)
plot(modClu)

### Same model, stratified on cov2 (one smoothing parameter per stratum) ###

modClu.str <- frailtyPenal(Surv(t1, t2, event) ~ cluster(group) + subcluster(subgroup) +
                             cov1 + strata(cov2),
                           Frailty = TRUE, data = dataNested, n.knots = 8,
                           kappa1 = 20000, kappa2 = 20000)

# It takes around 8 minutes to converge (depends on the processor)

print(modClu.str)
summary(modClu.str)
plot(modClu.str)
